AITopics | branch model

An Empirical Study of Mamba-based Pedestrian Attribute Recognition

Wang, Xiao, Kong, Weizhe, Jin, Jiandong, Wang, Shiao, Gao, Ruichong, Ma, Qingchuan, Li, Chenglong, Tang, Jin

arXiv.org Artificial Intelligence, Jul-14-2024

Current strong pedestrian attribute recognition (PAR) models are developed based on Transformer networks, which are computationally heavy. Recently proposed models with linear complexity (e.g., Mamba) have garnered significant attention and have achieved a good balance between accuracy and computational cost across a variety of visual tasks. Relevant review articles also suggest that while these models can perform well on some pedestrian attribute recognition datasets, they are generally weaker than the corresponding Transformer models. To further tap into the potential of the novel Mamba architecture for PAR tasks, this paper designs and adapts Mamba into two typical PAR frameworks, i.e., the text-image fusion approach and the pure vision Mamba multi-label recognition framework. It is found that interacting with attribute tags as additional input does not always lead to an improvement: specifically, Vim can be enhanced, but VMamba cannot. This paper further designs various hybrid Mamba-Transformer variants and conducts thorough experimental validations. These experimental results indicate that simply enhancing Mamba with a Transformer does not always lead to performance improvements but yields better results under certain settings. We hope this empirical study can further inspire research in Mamba for PAR, and even extend into the domain of multi-label recognition, through the design of these network structures and comprehensive experimentation. The source code of this work will be released at https://github.com/Event-AHU/OpenPAR.

dataset, mamba, recognition, (15 more...)

arXiv: 2407.10374

Country:

Asia > China > Anhui Province > Hefei (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report (1.00)
Overview (0.86)

Industry: Leisure & Entertainment > Sports > Golf (0.54)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.66)

KoReA-SFL: Knowledge Replay-based Split Federated Learning Against Catastrophic Forgetting

Xia, Zeke, Hu, Ming, Yan, Dengke, Liu, Ruixuan, Li, Anran, Xie, Xiaofei, Chen, Mingsong

arXiv.org Artificial Intelligence, Apr-19-2024

Although Split Federated Learning (SFL) is good at enabling knowledge sharing among resource-constrained clients, it suffers from low training accuracy because it neglects data heterogeneity and catastrophic forgetting. To address this issue, we propose a novel SFL approach named KoReA-SFL, which adopts a multi-model aggregation mechanism to alleviate gradient divergence caused by heterogeneous data and a knowledge replay strategy to deal with catastrophic forgetting. Specifically, in KoReA-SFL, cloud servers (i.e., the fed server and the main server) maintain multiple branch-model portions, rather than a single global portion, for local training, and an aggregated master-model portion for knowledge sharing among branch portions. To avoid catastrophic forgetting, the main server of KoReA-SFL selects multiple assistant devices for knowledge replay according to the training data distribution of each server-side branch-model portion. Experimental results obtained from non-IID and IID scenarios demonstrate that KoReA-SFL significantly outperforms conventional SFL methods (by up to 23.25% test accuracy improvement).

korea-sfl, server, test accuracy, (12 more...)

arXiv: 2404.12846

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Virginia (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology (0.46)
Education (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
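
The KoReA-SFL abstract says the main server picks assistant devices for knowledge replay according to the training data distribution of each server-side branch-model portion, but it does not give the selection rule. The sketch below is one possible reading, not the authors' method: it assumes each branch and each device is summarized by a per-class sample count, and it greedily picks the devices whose data would bring the branch's cumulative label distribution closest to uniform. The names `imbalance` and `select_assistants`, the distance measure, and the greedy strategy are all hypothetical.

```python
from typing import Dict, List


def imbalance(counts: List[int]) -> float:
    """Total variation distance between a label histogram and the uniform one."""
    total = sum(counts)
    if total == 0:
        return 1.0  # assumed convention: an empty histogram counts as maximally imbalanced
    uniform = 1.0 / len(counts)
    return 0.5 * sum(abs(c / total - uniform) for c in counts)


def select_assistants(
    branch_counts: List[int],
    device_counts: Dict[str, List[int]],
    k: int,
) -> List[str]:
    """Greedily pick up to k devices that best rebalance the branch's label mix.

    Hypothetical rule: the abstract only states that selection depends on the
    training data distribution seen by each branch-model portion.
    """
    combined = list(branch_counts)
    candidates = dict(device_counts)
    chosen: List[str] = []
    for _ in range(min(k, len(candidates))):
        # Sorting the names makes tie-breaking deterministic.
        best = min(
            sorted(candidates),
            key=lambda d: imbalance([a + b for a, b in zip(combined, candidates[d])]),
        )
        chosen.append(best)
        combined = [a + b for a, b in zip(combined, candidates.pop(best))]
    return chosen


# A branch that has only seen class 0, and three candidate devices.
devices = {"a": [10, 0, 0], "b": [0, 10, 0], "c": [0, 5, 5]}
print(select_assistants([10, 0, 0], devices, k=2))  # ['c', 'b']
```

In this toy run the device holding more of class 0 is never chosen, which matches the intuition that replay should revisit the classes a branch is at risk of forgetting.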

GitFL: Adaptive Asynchronous Federated Learning using Version Control

Hu, Ming, Xia, Zeke, Yue, Zhihao, Xia, Jun, Huang, Yihao, Liu, Yang, Chen, Mingsong

arXiv.org Artificial Intelligence, Nov-22-2022

As a promising distributed machine learning paradigm that enables collaborative training without compromising data privacy, Federated Learning (FL) has been increasingly used in AIoT (Artificial Intelligence of Things) design. However, due to the lack of efficient management of straggling devices, existing FL methods suffer greatly from low inference accuracy and long training time. The situation worsens when various uncertain factors in AIoT scenarios (e.g., network delays, performance variances caused by process variation) are taken into account. To address this issue, this paper proposes a novel asynchronous FL framework named GitFL, whose implementation is inspired by the well-known version control system Git. Unlike traditional FL, the cloud server of GitFL maintains a master model (i.e., the global model) together with a set of branch models indicating the trained local models committed by selected devices, where the master model is updated based on both all the pushed branch models and their version information, and only the branch models after the pull operation are dispatched to devices. By using our proposed Reinforcement Learning (RL)-based device selection mechanism, a pulled branch model with an older version will be more likely to be dispatched to a faster and less frequently selected device for the next round of local training. In this way, GitFL enables both effective control of model staleness and adaptive load balancing of versioned models among straggling devices, thus avoiding performance deterioration. Comprehensive experimental results on well-known models and datasets show that, compared with state-of-the-art asynchronous FL methods, GitFL can achieve up to 2.64x training acceleration and 7.88% inference accuracy improvements in various uncertain scenarios.

artificial intelligence, branch model, machine learning, (17 more...)

arXiv: 2211.12049

Country:

Asia > Singapore (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
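
The GitFL abstract describes a push step, where the master model is updated from all pushed branch models and their version information, and a pull step applied to a branch model before it is dispatched. The exact update rules are not stated in the abstract, so the following is only a minimal sketch under assumptions: the master is taken to be a version-weighted average of the branches, and a pull is taken to be a fixed blend of the branch with the master. The `BranchModel` class, the `mix` parameter, and both formulas are hypothetical and serve only to illustrate the master/branch bookkeeping.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BranchModel:
    weights: List[float]  # flattened model parameters
    version: int          # hypothetical counter of committed local training rounds


def merge_master(branches: List[BranchModel]) -> List[float]:
    """Build the master as a version-weighted average of branches (assumed rule)."""
    total = sum(b.version for b in branches)
    if total == 0:
        raise ValueError("at least one branch needs a positive version")
    size = len(branches[0].weights)
    return [
        sum(b.version * b.weights[i] for b in branches) / total
        for i in range(size)
    ]


def pull(branch: BranchModel, master: List[float], mix: float = 0.5) -> BranchModel:
    """Blend the master into a branch before dispatching it (assumed rule)."""
    blended = [(1 - mix) * w + mix * m for w, m in zip(branch.weights, master)]
    return BranchModel(blended, branch.version)


branches = [
    BranchModel([0.0, 0.0], version=1),  # a stale branch
    BranchModel([4.0, 8.0], version=3),  # a more up-to-date branch
]
master = merge_master(branches)
dispatched = pull(branches[0], master)
print(master)              # [3.0, 6.0]
print(dispatched.weights)  # [1.5, 3.0]
```

Under these assumptions the stale branch is pulled toward the master before it goes back out, which is one simple way to limit staleness. The RL-based device selection mentioned in the abstract is not modeled here.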
